Pseudo-Relevance Feedback Driven for XML Query Expansion
Authors
Abstract
Pseudo-relevance feedback is widely regarded as an effective approach to automatic query expansion. However, a recent study has shown that traditional pseudo-relevance feedback may introduce topic drift and thus harm retrieval performance. It is therefore crucial to identify good feedback documents from which useful expansion terms can be added to the query. Compared with traditional query expansion, XML query expansion requires not only content expansion but also structural expansion. This paper presents a solution for both identifying related documents and selecting good expansion information with new content and path constraints. A naïve document similarity measure that incorporates XML semantic features is proposed. Based on this measure, a k-median clustering algorithm is first applied to find a set of related documents. Query expansion is then performed in two steps, only within the set of related documents: in the first step, a key phrase extraction algorithm expands the original query; in the second step, structural expansion is carried out based on the expanded key phrases. Finally, a full-fledged content-and-structure query expression that represents the user's intention is formalized. Experimental results on the IEEE CS collection show that the proposed method effectively reduces topic drift and achieves better retrieval quality.
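The content side of this idea can be illustrated with a minimal Python sketch. It is not the paper's method: Jaccard similarity over word sets stands in for the XML-aware document similarity, a single-medoid filter stands in for k-median clustering, and document-frequency counting stands in for key phrase extraction. Structural (path) expansion is not covered. The function names `select_related` and `expand_query`, the `threshold` value, and the sample documents are all hypothetical.

```python
from collections import Counter


def tokens(text):
    """Lowercased set of whitespace-separated words (a stand-in for real parsing)."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two word sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_related(docs, threshold=0.3):
    """Return indices of feedback documents close to the medoid document.

    Assumption: one medoid replaces the clustering step. The medoid is the
    document with the highest total similarity to all other documents.
    Expects a non-empty list of documents.
    """
    sets = [tokens(d) for d in docs]
    totals = [
        sum(jaccard(s, t) for j, t in enumerate(sets) if j != i)
        for i, s in enumerate(sets)
    ]
    medoid = max(range(len(sets)), key=lambda i: totals[i])
    return [i for i, s in enumerate(sets) if jaccard(s, sets[medoid]) >= threshold]


def expand_query(query, docs, num_terms=2, threshold=0.3):
    """Append the most widespread new terms from the related documents only."""
    query_terms = query.lower().split()
    related = select_related(docs, threshold)
    # Count in how many related documents each non-query term occurs.
    counts = Counter(
        t for i in related for t in tokens(docs[i]) if t not in query_terms
    )
    # Highest count first; ties are broken alphabetically for determinism.
    best = sorted(counts, key=lambda t: (-counts[t], t))[:num_terms]
    return query_terms + best


feedback_docs = [
    "xml query expansion structure path",
    "xml retrieval query structure element",
    "football match score goal",
    "xml element path structure query",
]
expanded = expand_query("xml query", feedback_docs)
```

In this toy run the off-topic third document is excluded before any terms are chosen, which is the kind of filtering that limits topic drift in the sense described above.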
Similar resources
Query expansion based on relevance feedback and latent semantic analysis
Web search engines are among the most popular tools on the Internet and are widely used by expert and novice users. Constructing an adequate query that best specifies the user’s information need to the search engine is an important concern of web users. Query expansion is a way to reduce this concern and increase user satisfaction. In this paper, a new method of query expa...
Feedback-Driven Structural Query Expansion for Ranked Retrieval of XML Data
Relevance feedback is an important way to enhance retrieval quality by integrating relevance information provided by a user. In XML retrieval, feedback engines usually generate an expanded query from the content of elements marked as relevant or nonrelevant. This approach, which is inspired by text-based IR, completely ignores the semistructured nature of XML. This paper makes the important step f...
Relevance Feedback in XML Retrieval
Highly heterogeneous XML data collections that do not have a global schema, as arising, for example, in federations of digital libraries or scientific data repositories, cannot be effectively queried with XQuery or XPath alone, but rather require a ranked retrieval approach. As known from ample work in the IR field, relevance feedback provided by the user that drives automatic query refinement ...
Relevance Feedback for Structural Query Expansion
Keyword-based queries are an important means to retrieve information from XML collections with unknown or complex schemas. Relevance Feedback integrates relevance information provided by a user to enhance retrieval quality. For keyword-based XML queries, feedback engines usually generate an expanded keyword query from the content of elements marked as relevant or nonrelevant. This approach that...
Query Expansion Strategy based on Pseudo Relevance Feedback and Term Weight Scheme for Monolingual Retrieval
Query expansion using pseudo-relevance feedback is a useful technique for reformulating the query. In this paper, expansion terms are obtained by combining pseudo-relevance feedback and equi-frequency partition of the documents with the tf-idf scoring technique. It is observed that the groups of words that have the same tf-idf score as the query terms are better candidate words for query expansion ...
Journal title:
- JCIT
Volume 5, Issue
Pages -
Publication date: 2010